1.
BMC Bioinformatics ; 25(1): 158, 2024 Apr 20.
Article in English | MEDLINE | ID: mdl-38643066

ABSTRACT

BACKGROUND: Motif finding in Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq) data is essential to reveal the intricacies of transcription factor binding sites (TFBSs) and their pivotal roles in gene regulation. Deep learning technologies, including convolutional neural networks (CNNs) and graph neural networks (GNNs), have achieved success in finding ATAC-seq motifs. However, CNN-based methods are limited by the fixed width of the convolutional kernel, which makes it difficult to find multiple transcription factor binding sites with different lengths. GNN-based methods are limited by using the edge weight information directly, which makes it difficult to aggregate neighboring nodes' information efficiently when computing node embeddings.
RESULTS: To address these challenges, we developed a novel graph attention network framework named MMGAT, which employs an attention mechanism to adjust the attention coefficients among different nodes. MMGAT then finds multiple ATAC-seq motifs based on the attention coefficients of sequence nodes and k-mer nodes as well as the coexisting probability of k-mers. Our approach achieved better performance on the human ATAC-seq datasets than existing tools, as evidenced by the highest scores on the precision, recall, F1_score, ACC, AUC, and PRC metrics, as well as by finding 389 higher-quality motifs. To validate the performance of MMGAT in predicting TFBSs and finding motifs on more datasets, we enlarged the number of human ATAC-seq datasets to 180 and newly integrated 80 mouse ATAC-seq datasets for multi-species experimental validation. On the mouse ATAC-seq datasets specifically, MMGAT also achieved the highest scores on the six metrics and found 356 higher-quality motifs. To help researchers use MMGAT, we have also developed a user-friendly web server named MMGAT-S that hosts the MMGAT method and ATAC-seq motif finding results.
CONCLUSIONS: The advanced methodology MMGAT provides a robust tool for finding ATAC-seq motifs, and the comprehensive server MMGAT-S makes a significant contribution to genomics research. The open-source code of MMGAT can be found at https://github.com/xiaotianr/MMGAT, and MMGAT-S is freely available at https://www.mmgraphws.com/MMGAT-S/.


Subjects
Chromatin Immunoprecipitation Sequencing , Genomics , Humans , Animals , Mice , Binding Sites , Protein Binding , Genomics/methods , Chromatin/genetics , Transcription Factors/metabolism
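
The abstract of this record describes scoring k-mer nodes with attention coefficients. The sketch below only illustrates those two ingredients on a toy sequence: extracting overlapping k-mers and softmax-normalizing raw scores into coefficients that sum to one. It is not the MMGAT implementation; the function names, the toy sequence, and the use of k-mer counts as raw scores are all assumptions made for illustration.

```python
import math
from collections import Counter


def kmers(seq, k):
    """Return all overlapping k-mers of a DNA sequence, left to right."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def attention_coefficients(scores):
    """Softmax-normalize raw neighbor scores into attention coefficients."""
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {node: math.exp(s - peak) for node, s in scores.items()}
    total = sum(exps.values())
    return {node: e / total for node, e in exps.items()}


# Toy sequence node. Using k-mer counts as raw scores is an assumption;
# a trained graph attention layer would learn these scores instead.
sequence = "ACGTACGTTT"
counts = Counter(kmers(sequence, 3))
coeffs = attention_coefficients({kmer: float(c) for kmer, c in counts.items()})
top_kmer = max(coeffs, key=coeffs.get)
```

In this toy case the repeated k-mers receive the largest coefficients, which is the intuition behind ranking k-mer nodes by attention before assembling motifs.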
2.
Brain Sci ; 13(2)2023 Feb 10.
Article in English | MEDLINE | ID: mdl-36831850

ABSTRACT

Deep learning has shown impressive diagnostic abilities in Alzheimer's disease (AD) research in recent years. However, although neuropsychological tests play a crucial role in screening for AD and mild cognitive impairment (MCI), there is still a lack of deep learning algorithms that use only such basic diagnostic methods. This paper proposes a novel semi-supervised method using neuropsychological test scores and scarce labeled data, which introduces difference regularization and consistency regularization with pseudo-labeling. A total of 188 AD, 402 MCI, and 229 normal control (NC) subjects from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database were enrolled in the study. We first used feature selection to choose, from the seven neuropsychological tests, the 15 features most associated with the diagnostic outcome. Next, we proposed a dual semi-supervised learning (DSSL) framework that uses two encoders to learn two different feature vectors. Sets of 60 and 120 diagnosed subjects were randomly selected to provide the training labels for the model. The experimental results show that DSSL achieves the best accuracy and stability in classifying AD, MCI, and NC (85.47% accuracy for 60 labels and 88.40% accuracy for 120 labels) compared to other semi-supervised methods. DSSL is an excellent semi-supervised method for providing clinical insight to physicians diagnosing AD and MCI.
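
The combination of pseudo-labeling and consistency regularization mentioned in this abstract can be sketched generically as follows. This is a minimal illustration under stated assumptions, not the DSSL method: the function names, the 0.9 confidence threshold, and the mean-squared form of the consistency term are all hypothetical, and the paper's difference regularization is not modeled here.

```python
def pseudo_label(probs_a, probs_b, threshold=0.9):
    """Return the agreed class index, or None if the two encoders disagree or are unsure.

    probs_a and probs_b are class-probability lists from two encoders for the
    same unlabeled subject. The threshold value is an illustrative assumption.
    """
    mean = [(a + b) / 2 for a, b in zip(probs_a, probs_b)]
    best = max(range(len(mean)), key=mean.__getitem__)
    agree = probs_a.index(max(probs_a)) == probs_b.index(max(probs_b))
    return best if agree and mean[best] >= threshold else None


def consistency_loss(probs_a, probs_b):
    """Mean squared difference between the two encoders' predictions."""
    return sum((a - b) ** 2 for a, b in zip(probs_a, probs_b)) / len(probs_a)


# Classes are indexed 0 = AD, 1 = MCI, 2 = NC in this toy example.
confident = pseudo_label([0.95, 0.03, 0.02], [0.92, 0.05, 0.03])
uncertain = pseudo_label([0.60, 0.30, 0.10], [0.20, 0.70, 0.10])
```

Here `confident` is class 0 because both encoders agree with high averaged probability, while `uncertain` is `None` and the subject would stay unlabeled for that training round.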

3.
Entropy (Basel) ; 25(1)2023 Jan 04.
Article in English | MEDLINE | ID: mdl-36673247

ABSTRACT

Feature detection and correct matching are the basis of the image stitching process. Whether the matching is correct and the number of matches directly affect the quality of the final stitching results. At present, almost all image stitching methods use the SIFT+RANSAC pattern to extract and match feature points. However, it is difficult to obtain sufficient correct matching points in low-textured or repetitively-textured regions, resulting in insufficient matching points in the overlapping region, which in turn leads to the warping model being estimated erroneously. In this paper, we propose a novel and flexible approach that increases feature correspondences and optimizes hybrid terms. It can obtain sufficient correct feature correspondences in overlapping regions with low-textured or repetitively-textured areas to eliminate misalignment. When weak texture and large parallax coexist in the overlapping region, alignment and distortion often restrict each other and are difficult to balance. Accurate alignment is often accompanied by projection distortion and perspective distortion. To address this, we propose a hybrid-terms optimization warp, which combines global similarity transformations on the basis of an initial global homography and estimates the optimal warping by adjusting various term parameters. By doing this, we can mitigate projection distortion and perspective distortion while effectively balancing alignment and distortion. The experimental results demonstrate that the proposed method outperforms the state of the art in accurate alignment on images with low-textured areas in the overlapping region, and the stitching results have less perspective and projection distortion.
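
To make the RANSAC half of the SIFT+RANSAC pattern concrete, here is a deliberately simplified sketch. It assumes the feature matches already exist (no SIFT step is shown) and it fits a pure 2D translation instead of the homography that stitching pipelines normally estimate, so it illustrates only the sample-and-count-inliers idea. The function name, tolerance, iteration count, and toy matches are assumptions, not taken from the paper.

```python
import random


def ransac_translation(matches, tolerance=1.0, iterations=50, seed=0):
    """Estimate a 2D translation from point matches, ignoring outliers.

    matches: list of ((x1, y1), (x2, y2)) pairs from some feature matcher.
    Returns the best offset (dx, dy) and the number of matches agreeing with it.
    """
    rng = random.Random(seed)
    best_offset, best_inliers = None, 0
    for _ in range(iterations):
        # One match is enough to hypothesize a translation model.
        (x1, y1), (x2, y2) = rng.choice(matches)
        dx, dy = x2 - x1, y2 - y1
        # Count how many matches are consistent with this hypothesis.
        inliers = sum(
            1
            for (ax, ay), (bx, by) in matches
            if abs(bx - ax - dx) <= tolerance and abs(by - ay - dy) <= tolerance
        )
        if inliers > best_inliers:
            best_offset, best_inliers = (dx, dy), inliers
    return best_offset, best_inliers


# Six correct matches shifted by (10, 5) plus two wrong matches.
points = [(0, 0), (3, 7), (8, 2), (5, 5), (9, 9), (1, 4)]
good = [((x, y), (x + 10, y + 5)) for x, y in points]
bad = [((2, 2), (40, -3)), ((6, 1), (-20, 30))]
offset, inliers = ransac_translation(good + bad)
```

The failure mode the abstract describes follows directly from this loop: when a low-textured overlap yields few correct matches, no hypothesis gathers a clear inlier majority and the estimated warp becomes unreliable.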

4.
Nat Genet ; 45(1): 51-8, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23179023

ABSTRACT

Watermelon, Citrullus lanatus, is an important cucurbit crop grown throughout the world. Here we report a high-quality draft genome sequence of the east Asia watermelon cultivar 97103 (2n = 2× = 22) containing 23,440 predicted protein-coding genes. Comparative genomics analysis provided an evolutionary scenario for the origin of the 11 watermelon chromosomes derived from a 7-chromosome paleohexaploid eudicot ancestor. Resequencing of 20 watermelon accessions representing three different C. lanatus subspecies produced numerous haplotypes and identified the extent of genetic diversity and population structure of watermelon germplasm. Genomic regions that were preferentially selected during domestication were identified. Many disease-resistance genes were also found to be lost during domestication. In addition, integrative genomic and transcriptomic analyses yielded important insights into aspects of phloem-based vascular signaling in common between watermelon and cucumber and identified genes crucial to valuable fruit-quality traits, including sugar accumulation and citrulline metabolism.


Subjects
Citrullus/genetics , Plant Genome , Chromosome Mapping , Plant Chromosomes , Citrullus/classification , Computational Biology/methods , Molecular Evolution , Gene Expression Profiling , Plant Gene Expression Regulation , Genetic Variation , High-Throughput Nucleotide Sequencing , Molecular Sequence Annotation , Molecular Sequence Data , Phylogeny , Nucleic Acid Repetitive Sequences , Transcriptome
5.
PLoS One ; 7(1): e29453, 2012.
Article in English | MEDLINE | ID: mdl-22247776

ABSTRACT

As part of our ongoing efforts to sequence and map the watermelon (Citrullus spp.) genome, we have constructed a high-density genetic linkage map. The map positioned 234 watermelon genome sequence scaffolds (an average size of 1.41 Mb) that cover about 330 Mb and account for 93.5% of the 353 Mb of the assembled genomic sequences of the elite Chinese watermelon line 97103 (Citrullus lanatus var. lanatus). The genetic map was constructed using an F(8) population of 103 recombinant inbred lines (RILs). The RILs are derived from a cross between the line 97103 and the United States Plant Introduction (PI) 296341-FR (C. lanatus var. citroides) that contains resistance to fusarium wilt (races 0, 1, and 2). The genetic map consists of eleven linkage groups that include 698 simple sequence repeat (SSR), 219 insertion-deletion (InDel) and 36 structure variation (SV) markers and spans ∼800 cM with a mean marker interval of 0.8 cM. Using fluorescent in situ hybridization (FISH) with 11 BACs that produced chromosome-specific signals, we have depicted watermelon chromosomes that correspond to the eleven linkage groups constructed in this study. The high-resolution genetic map developed here should be a useful platform for the assembly of the watermelon genome, for the development of sequence-based markers used in breeding programs, and for the identification of genes associated with important agricultural traits.


Subjects
Chromosome Mapping , Citrullus/genetics , Plant DNA/genetics , Genetic Linkage/genetics , Genetic Markers/genetics , Plant Genome , Fluorescence In Situ Hybridization , Microsatellite Repeats , Polymerase Chain Reaction
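
The summary figures in this record's abstract can be cross-checked with simple arithmetic. The sketch below uses only the numbers quoted in the abstract; the variable names are illustrative, and dividing the map length by the marker count is a rough approximation of the mean marker interval (it ignores the eleven linkage-group ends).

```python
# Figures quoted in the abstract.
markers = {"SSR": 698, "InDel": 219, "SV": 36}
map_length_cm = 800        # reported as approximately 800 cM
scaffolds = 234
mean_scaffold_mb = 1.41
covered_mb = 330           # reported as "about 330 Mb"
assembly_mb = 353

total_markers = sum(markers.values())
mean_interval_cm = map_length_cm / total_markers
anchored_mb = scaffolds * mean_scaffold_mb
anchored_percent = 100 * covered_mb / assembly_mb

print(total_markers)
print(round(mean_interval_cm, 2))
print(round(anchored_mb, 1))
print(round(anchored_percent, 1))
```

The three marker classes add up to 953 markers, giving roughly 0.84 cM per marker, in line with the reported mean interval of 0.8 cM. The scaffold count times the average scaffold size comes to just under 330 Mb, and 330 Mb out of 353 Mb matches the stated 93.5%.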
...